Generalized Statistical Methods for Mixed Exponential Families, Part I: Theoretical Foundations

Authors

  • Cécile Levasseur
  • Kenneth Kreutz-Delgado
  • Uwe F. Mayer
Abstract

This work considers the problem of learning the underlying statistical structure of multidimensional data of mixed probability distribution types (continuous and discrete) for the purpose of fitting a generative model and making decisions in a data-driven manner. Using properties of exponential family distributions and generalizing classical linear statistics techniques, a unified theoretical model called Generalized Linear Statistics (GLS) is established. The methodology exploits the split between data space and natural parameter space for exponential family distributions and solves a nonlinear problem by using classical linear statistical tools applied to data that have been mapped into the parameter space. The framework is equivalent to a computationally tractable, mixed data-type hierarchical Bayes graphical model assumption with latent variables constrained to a low-dimensional parameter subspace. We demonstrate that exponential family Principal Component Analysis, Semi-Parametric exponential family Principal Component Analysis, and Bregman soft clustering are not separate unrelated algorithms, but different manifestations of model assumptions and parameter choices taken within this common GLS framework. We readily extend these algorithms to deal with the important mixed data-type case. We study in detail the extreme case corresponding to exponential family Principal Component Analysis and solve problems related to fitting the generative model.
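As a rough illustration of the idea described in the abstract (fitting mixed continuous and discrete data by constraining natural parameters to a low-dimensional subspace), the sketch below implements a small exponential family PCA for data whose columns are either unit-variance Gaussian or Bernoulli. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function names (`log_partition`, `mean_map`, `neg_log_lik`, `fit_epca`), the unit-variance Gaussian choice, the factorization `theta = A @ V + b`, and the use of plain scaled gradient descent are all assumptions made here for illustration.

```python
import numpy as np

def log_partition(theta, is_binary):
    """G(theta) per entry: theta^2/2 (unit-variance Gaussian) or log(1 + e^theta) (Bernoulli)."""
    return np.where(is_binary, np.logaddexp(0.0, theta), 0.5 * theta ** 2)

def mean_map(theta, is_binary):
    """G'(theta): maps natural parameters back to expected values in data space."""
    return np.where(is_binary, 1.0 / (1.0 + np.exp(-theta)), theta)

def neg_log_lik(X, theta, is_binary):
    """Negative log-likelihood, dropping terms that do not depend on theta."""
    return float(np.sum(log_partition(theta, is_binary) - X * theta))

def fit_epca(X, is_binary, q=1, steps=300, lr=0.05, seed=0):
    """Fit theta = A @ V + b, so each row's natural parameters lie in a q-dim affine subspace.

    X         : (n, d) data matrix
    is_binary : (d,) boolean mask, True for Bernoulli columns, False for Gaussian ones
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    A = 0.01 * rng.standard_normal((n, q))   # latent coordinates, one row per sample
    V = 0.01 * rng.standard_normal((q, d))   # basis of the parameter subspace
    b = np.zeros(d)                          # offset in natural parameter space
    history = [neg_log_lik(X, A @ V + b, is_binary)]
    for _ in range(steps):
        # Derivative of the loss with respect to theta is (expected value - observation).
        R = mean_map(A @ V + b, is_binary) - X
        grad_A = R @ V.T
        grad_V = A.T @ R / n        # shared parameters: averaged over samples for a stable step
        grad_b = R.mean(axis=0)
        A -= lr * grad_A
        V -= lr * grad_V
        b -= lr * grad_b
        history.append(neg_log_lik(X, A @ V + b, is_binary))
    return A, V, b, history
```

Calling `fit_epca(X, is_binary, q=1)` returns the latent coordinates `A`, the subspace basis `V`, the offset `b`, and the loss recorded before and after each step. The linear structure lives entirely in the natural parameter space, and `mean_map` is the nonlinear link back to the data space, which is the split the abstract refers to.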

Similar resources

Estimation in Simple Step-Stress Model for the Marshall-Olkin Generalized Exponential Distribution under Type-I Censoring

This paper considers the simple step-stress model from the Marshall-Olkin generalized exponential distribution when there is a time constraint on the duration of the experiment. The maximum likelihood equations for estimating the parameters are derived, assuming a cumulative exposure model with lifetimes distributed as Marshall-Olkin generalized exponential. The likelihood equations do not lea...

Generalized Statistical Methods for Mixed Exponential Families, Part II: Applications

This work considers the problem of both supervised and unsupervised classification for vector data of mixed types. An important subclass of graphical modeling techniques called Generalized Linear Statistics (GLS) is used to capture the underlying statistical structure of these complex data. The GLS methodology exploits the split between data space and natural parameter space for exponential fam...

Graphical Models, Exponential Families, and Variational Inference

The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimizati...

An EM Algorithm for Estimating the Parameters of the Generalized Exponential Distribution under Unified Hybrid Censored Data

Unified hybrid censoring is a mixture of generalized Type-I and Type-II hybrid censoring schemes. This article presents statistical inference on the parameters of the Generalized Exponential Distribution when the data are obtained from the unified hybrid censoring scheme. It is observed that the maximum likelihood estimators cannot be derived in closed form. The EM algorithm for computing the ma...

Statistical Modeling for Oblique Collision of Nano and Micro Droplets in Plasma Spray Processes

Spreading and coating of nano and micro droplets on solid surfaces is important in a wide variety of applications, including plasma spray coating, ink jet printing, and DNA synthesis. In spraying processes, most droplets collide obliquely with the surface. The purpose of this article is to study the distribution of nano and micro droplets spreading when droplets impact at an oblique a...


Publication date: 2009